{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Classes and Objects\n", "Reading material: [tutorialspoint](http://www.tutorialspoint.com/python/python_classes_objects.htm)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "A `class` is a user-defined type that groups functions and data, which can be accessed with the `.` (dot) operator. A `class` serves as a blueprint for objects." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "class Complex :\n", "    '''class representing complex numbers. supports basic complex arithmetic'''\n", "    def __init__(self, real, imag=0.0):\n", "        self.real = real # instance variable\n", "        self.imag = imag # instance variable\n", "\n", "    def add(self, other):\n", "        return Complex(self.real + other.real, self.imag + other.imag)\n", "\n", "    def sub(self, other):\n", "        return Complex(self.real - other.real, self.imag - other.imag)\n", "\n", "    def mul(self, other):\n", "        return Complex(self.real*other.real - self.imag*other.imag,\n", "                       self.imag*other.real + self.real*other.imag)\n", "\n", "    def display(self):\n", "        print('{:.2f}+{:.2f}i'.format(self.real, self.imag))\n", "\n", "c1 = Complex(1.1,-0.3) #directly create Complex object/instance\n", "c2 = Complex(5.5,2) #directly create Complex object/instance\n", "c3 = c1.mul(c2) #indirectly create Complex object/instance\n", "c3.display()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We write the `class Complex` once and create multiple `Complex` objects. In this sense, a `class` is a blueprint." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Notes:\n", "- __Instance variables__ are not listed outside the methods. You initialize them inside methods.\n", "- `self` refers to the current object. `self` must be the first parameter of every method. 
You must use `self` to refer to instance variables.\n", "- The constructor or initialization method `__init__` is called when you create a new instance of the class." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Magic methods and overloading operators\n", "\n", "Magic methods are special methods that add \"magic\" to your classes. They are surrounded by double underscores (e.g. `__init__` or `__add__`). \n", "We read `__` as \"dunder\" which is short for \"double under\".\n", "Overview of all of Python's magic methods: http://minhhh.github.io/posts/a-guide-to-pythons-magic-methods \n", "\n", "The following example implements `__add__`, `__sub__`, and `__mul__` so we can use the arithmetic operators. It also implements `__str__` so we can `print` the object meaningfully." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "class Complex:\n", "    '''this is a class demo'''\n", "    def __init__(self, real, imag=0.0):\n", "        self.real = real\n", "        self.imag = imag\n", "\n", "    def __add__(self, other):\n", "        return Complex(self.real + other.real, self.imag + other.imag)\n", "\n", "    def __sub__(self, other):\n", "        return Complex(self.real - other.real, self.imag - other.imag)\n", "\n", "    def __mul__(self, other):\n", "        return Complex(self.real*other.real - self.imag*other.imag,\n", "                       self.imag*other.real + self.real*other.imag)\n", "\n", "    def __str__(self):\n", "        return '{:.2f}+{:.2f}i'.format(self.real, self.imag)\n", "\n", "\n", "c1 = Complex(2.3,10)\n", "c2 = Complex(5.2,-2.9)\n", "print(c1 * c2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Imagine performing complex arithmetic without a `class`. You would have to carry around pairs of real numbers, and performing arithmetic would be much more error-prone.\n", "\n", "Having `class Complex` hold two real numbers and provide methods operating on the data is convenient." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Private Variables\n", "\n", "Variables and methods whose names begin with `__` (dunder) but do not end with it are by convention understood to be private. __Private__ variables and methods should only be accessed within the `class`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Example: Polynomial class\n", "\n", "The following `class` implements a univariate polynomial with real coefficients." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "class Polynomial :\n", "    '''\n", "    This class implements a univariate polynomial.\n", "    Arithmetic operations such as + - are supported. (* is an exercise)\n", "    '''\n", "\n", "    def __init__(self, init = 0) :\n", "        self.__poly_coeff = [] # list storing coefficients (private instance variable)\n", "\n", "        # Creates constant polynomial p(x) = init\n", "        if isinstance(init, int) or isinstance(init, float) :\n", "            self.__poly_coeff = [init]\n", "\n", "        # Copy the coefficients from given list\n", "        # init[n] = 'n-th coefficient'\n", "        elif isinstance(init, list) :\n", "            self.__poly_coeff = init.copy()\n", "\n", "        # Copy the given Polynomial instance\n", "        elif isinstance(init, Polynomial) :\n", "            for n in range(init.degree()+1) :\n", "                self.set_coeff(n, init.get_coeff(n))\n", "\n", "\n", "    # Returns the degree of Polynomial\n", "    def degree(self) :\n", "        return max([0]+[n for n,c in enumerate(self.__poly_coeff) if c != 0.0])\n", "\n", "    # Sets the coefficient of given degree term\n", "    def set_coeff(self, deg, new_coeff) :\n", "        if len(self.__poly_coeff) <= deg :\n", "            self.__poly_coeff += [0.0 for _ in range(deg + 1 - len(self.__poly_coeff))]\n", "        self.__poly_coeff[deg] = new_coeff\n", "\n", "    # Returns the coefficient of given degree term\n", "    def get_coeff(self, deg) :\n", "        return 0 if self.degree() < deg else self.__poly_coeff[deg]\n", "\n", "\n", "    # -self\n", "    def __neg__(self) :\n", "        result = Polynomial()\n", "        for n 
in range(self.degree() + 1) :\n", "            result.set_coeff(n, -self.__poly_coeff[n])\n", "        return result\n", "\n", "    # self + poly2\n", "    def __add__(self, poly2) :\n", "        result = Polynomial(self)\n", "        result += poly2\n", "        return result\n", "\n", "    # self - poly2\n", "    def __sub__(self, poly2) :\n", "        result = Polynomial(self)\n", "        result -= poly2\n", "        return result\n", "\n", "    # Overload += (self += poly2)\n", "    def __iadd__(self, poly2) :\n", "        poly2 = Polynomial(poly2)\n", "        for n in range(max(self.degree(),poly2.degree()) + 1) :\n", "            self.set_coeff(n, self.get_coeff(n) + poly2.get_coeff(n))\n", "        return self\n", "\n", "    # Overload -=\n", "    def __isub__(self, poly2) :\n", "        return (self.__iadd__(-poly2))\n", "\n", "    # Operators with Polynomial instance on the right\n", "    __radd__ = __add__ # other + self\n", "\n", "    # poly2 - self\n", "    def __rsub__(self, poly2) :\n", "        return -Polynomial(self) + poly2\n", "\n", "    # Evaluation of polynomial at x : p(x)\n", "    def __call__(self,x):\n", "        return sum([self.get_coeff(n)*(x**n) for n in range(self.degree() + 1)])\n", "\n", "    #returns algebraic formula of polynomial as a string\n", "    def __str__(self):\n", "        coeff_list = [self.get_coeff(n) for n in range(self.degree() + 1) ]\n", "\n", "        expr = ''\n", "        # Generate polynomial expression\n", "        for n in range(self.degree(), 0, -1) :\n", "            if coeff_list[n] == 0 :\n", "                pass\n", "            elif coeff_list[n] == 1 :\n", "                expr += '+ x^{0} '.format(n)\n", "            elif coeff_list[n] == -1 :\n", "                expr += '- x^{0} '.format(n)\n", "            elif coeff_list[n] < 0 :\n", "                expr += '- {0:.2f}x^{1} '.format(- coeff_list[n], n)\n", "            else :\n", "                expr += '+ {0:.2f}x^{1} '.format(coeff_list[n], n)\n", "\n", "        if coeff_list[0] < 0 :\n", "            expr += '- ' + '{:.2f}'.format(- coeff_list[0])\n", "        elif coeff_list[0] > 0 :\n", "            expr += '+ ' + '{:.2f}'.format(coeff_list[0])\n", "\n", "        if expr[:2] == \"+ \":\n", "            return expr[2:]\n", "        elif expr[:2] == \"- \":\n", "            return \"-\" + expr[2:]\n", "        return '0.00' # zero polynomial: expr is empty\n", 
"\n", "\n", "# Test code\n", "p1 = Polynomial()\n", "p1.set_coeff(0, 1.2)\n", "p1.set_coeff(3, 2.2)\n", "p1.set_coeff(7, -9.0)\n", "p1.set_coeff(7, 0.0)\n", "# # degree of polynomial is now 3\n", "print(p1)\n", "print(-p1) #call negation operator\n", "\n", "print(p1.degree())\n", "\n", "p2 = Polynomial([1, 1.3])\n", "# print(p2.get_coeff(0))\n", "# print(p2.get_coeff(1))\n", "# print(p2.get_coeff(2)) #should be 0\n", "# print(p2.get_coeff(3)) #should be 0\n", "# print(p2.get_coeff(4)) #should be 0\n", "# print(p2.get_coeff(5)) #should be 0\n", "\n", "print(p2 + p1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Access the __docstring__ of a class by accessing the `__doc__` attribute of the `class`. By convention, the __docstring__ provides a brief description of the `class`." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "print(Polynomial.__doc__)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Use `dir` or access the `__dict__` attribute to see the functionality a `class` provides." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(Polynomial.__dict__)\n", "print(dir(Polynomial))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Duck typing\n", "\n", "The following function `sum_all` sums numbers of a list." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def sum_all(lst):\n", " ret = None\n", " for elem in lst:\n", " if ret is None:\n", " ret = elem\n", " else:\n", " ret = ret + elem\n", " return ret\n", " \n", " \n", "print(sum_all([1,2,3]))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "But wait, `lst` need not be a list and the elements of `lst` need not be numbers. \"Sums numbers of a list\" does not fully describe the capability of `sum_all`. 
\n", "\n", "Really, you can use `sum_all(lst)` if you can iterate through the elements of `lst` with a for loop (i.e., `lst` is an \"iterable\" as we define later) and you can use `+` with the elements of `lst` (i.e., the elements of `lst` are objects with the `__add__` method)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "lst1 = ['Python was named after ', 'the British TV series \"Monty Python.\" ']\n", "lst2 = ['The Dutch creator of Python, Guido van Rossum, seems to have a British sense of humor.']\n", "\n", "# print(sum_all((lst1,lst2))) # tuple of lists of strings\n", "\n", "c1 = Complex(1,2)\n", "c2 = Complex(3,4)\n", "c3 = Complex(-5,0)\n", "\n", "print(sum_all((c1,c2,c3))) # tuple of Complex" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the context of logic, the following saying describes a form of abductive reasoning:\n", "\n", "> \"If it looks like a duck, swims like a duck, and quacks like a duck, then it probably is a duck.\"\n", "\n", "In the context of programming, __duck typing__ refers to the practice of caring about what the object can do, rather than what it is." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Consider the following implementation of gradient descent for\n", "$$\n", "\\begin{array}{ll}\n", "\\underset{x\\in\\mathbb{R}^n}{\\mbox{minimize}}&\n", "\\frac{1}{2}\\|Ax-b\\|^2\n", "\\end{array}\n", "$$" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "\n", "m, n = 20, 5 # illustrative problem size\n", "A = np.random.randn(m, n) # A = m x n matrix\n", "b = np.random.randn(m) # b = vector of length m\n", "alpha = 1/np.linalg.norm(A, 2)**2 # step size 1/||A||^2 (illustrative choice)\n", "\n", "x = np.zeros(n)\n", "for _ in range(10000) :\n", "    x = x - alpha*A.T@(A@x-b)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What must `A` be able to __do__? 
(Note, we are not asking what `A` __is__.)\n", "\n", "- `A` must implement `__matmul__(self, np_vector)` so that `A@x` with a numpy vector `x` is allowed.\n", "- `A` must have an attribute `T` so that `A.T` is allowed, and `A.T` must itself support `@`.\n", "\n", "There are cases where you can implement matrix-vector multiplication with $A$ and $A^T$, but forming the $m\\times n$ matrix is inefficient (e.g. sparse matrix, FFT, and convolution). In these cases, you can provide objects supporting matrix-vector products without directly forming the full numpy matrix." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Duck typing is Pythonic. In statically-typed languages like C++ and Java, duck typing is mostly unavailable, and you typically use inheritance, templates, or function pointers to achieve similar effects." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Inheritance\n", "Because Python is not a statically-typed language, inheritance is not used to provide type-safety. Rather, inheritance is used to re-use certain features of another class and to build on top of it." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "class Matrix:\n", "    def __init__(self, dim, arr):\n", "        self.h = dim[0] # height\n", "        self.w = dim[1] # width\n", "        self.elem_list = arr[:] # make copy\n", "\n", "    def __add__(self, RHS):\n", "        return Matrix((self.h,self.w), [self.elem_list[i] + RHS.elem_list[i] for i in range(self.h*self.w)])\n", "\n", "    def __mul__(self, RHS):\n", "        e_list = [0] * self.h * RHS.w\n", "        for i in range(self.h):\n", "            for k in range(self.w):\n", "                for j in range(RHS.w):\n", "                    e_list[i*RHS.w+j] += self.elem_list[i*self.w+k] * RHS.elem_list[k*RHS.w+j]\n", "        return Matrix((self.h,RHS.w), e_list)\n", "\n", "    def __str__(self):\n", "        s = \"[\"\n", "        for i in range(self.h):\n", "            for j in range(self.w):\n", "                s += str(self.elem_list[i*self.w+j]) + \" \"\n", "            s += \"\\n\"\n", "        s = s[:-2] + \"]\"\n", "        return s\n", "\n", "class SquareMatrix(Matrix):\n", "    def det(self):\n", "        #some formula for computing the determinant\n", "        pass\n", "    def inverse(self):\n", "        #some formula for computing the inverse\n", "        pass\n", "\n", "m1 = Matrix((3,2),[1,6,2,6,3,5])\n", "m2 = Matrix((2,3),[1,2,2,1,1,2])\n", "print(m1)\n", "print(m2)\n", "print(str(m1*m2))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## For loop and iterables\n", "\n", "Container objects can be looped over using a for loop, but how?" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for element in [1, 2, 3]:\n", "    print(element)\n", "\n", "for element in (1, 2, 3):\n", "    print(element)\n", "\n", "for element in {1, 2, 3}:\n", "    print(element)\n", "\n", "for key in {'one':1, 'two':2}:\n", "    print(key) # iterate over keys but not values\n", "\n", "for char in \"ABC\":\n", "    print(char)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Also, what is `range(n)`?" 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for ind in range(5):\n", "    print(ind)\n", "print(range(5))\n", "print(type(range(5)))\n", "print(dir(range(5)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Generally, you can use for loops with __iterables__, which are objects that provide an __iterator__ through the method `__iter__()`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(range(5).__iter__())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "An __iterator__ provides access to the elements with the method `__next__()`.\n", "\n", "The following loop manually iterates through `range(5)`, an iterable." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "itr = range(5).__iter__()\n", "while True:\n", "    print(itr.__next__()) # raises StopIteration once the elements run out" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Usually, there is no need to directly call `__iter__`; it is better to use a `for` loop. The example above is for learning purposes.\n", "\n", "The end of the iterator is signaled using an exception." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "itr = range(5).__iter__()\n", "while True:\n", "    try:\n", "        print(itr.__next__())\n", "    except StopIteration:\n", "        break" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We won't spend time on exceptions and exception handling with try-except in this class, so don't worry if the above example doesn't make sense.\n", "\n", "The built-in functions `iter()` and `next()` call `__iter__` and `__next__` for you." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "itr = iter(\"Hello\")\n", "while True:\n", "    try:\n", "        print(next(itr))\n", "    except StopIteration:\n", "        break" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Custom iterable example." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "class Sentence:\n", "    def __init__(self, sentence):\n", "        self.sentence = sentence\n", "\n", "    def __iter__(self):\n", "        return SentenceIter(self.sentence)\n", "\n", "class SentenceIter:\n", "    def __init__(self, sentence):\n", "        self.words = sentence.split() # returns a list of words separated by spaces\n", "        self.index = 0\n", "\n", "    def __next__(self):\n", "        if self.index >= len(self.words):\n", "            raise StopIteration # StopIteration exception signals end of iterator\n", "        index = self.index\n", "        self.index += 1\n", "        return self.words[index]\n", "\n", "\n", "\n", "my_sentence = Sentence('This is a test')\n", "# for word in my_sentence:\n", "#     print(word)\n", "\n", "stIter = iter(my_sentence)\n", "\n", "print(next(stIter))\n", "print(next(stIter))\n", "print(next(stIter))\n", "print(next(stIter))\n", "print(next(stIter)) # out of elements: raises StopIteration" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Iterators do not have to end. The following is an example with the Fibonacci sequence." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "class Fibo:\n", "    def __init__(self):\n", "        pass\n", "\n", "    def __iter__(self):\n", "        return FiboIter()\n", "\n", "class FiboIter:\n", "    def __init__(self):\n", "        self.index = -1\n", "\n", "    def __next__(self):\n", "        self.index += 1\n", "        if self.index == 0:\n", "            return 0\n", "        elif self.index == 1:\n", "            self.prev, self.curr = 0, 1\n", "            return 1\n", "        else:\n", "            nxt = self.prev + self.curr\n", "            self.prev, self.curr = self.curr, nxt\n", "            return self.curr\n", "\n", "for num in Fibo():\n", "    if num > 100:\n", "        break\n", "    print(num)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is actually common practice to have one single class represent both the iterable and its iterator.\n", "\n", "The first of the following two examples was inspired by and copied from [Corey Schafer](https://www.youtube.com/channel/UCCezIgC97PvUuR4_gbFUs5g)'s YouTube channel." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "class Sentence:\n", "    def __init__(self, sentence):\n", "        self.sentence = sentence\n", "        self.words = sentence.split()\n", "        self.index = 0\n", "\n", "    def __iter__(self):\n", "        return self\n", "\n", "    def __next__(self):\n", "        if self.index >= len(self.words):\n", "            raise StopIteration # StopIteration exception signals end of iterator\n", "        index = self.index\n", "        self.index += 1\n", "        return self.words[index]\n", "\n", "\n", "my_sentence = Sentence('This is a test')\n", "# for word in my_sentence:\n", "#     print(word)\n", "\n", "\n", "print(next(my_sentence))\n", "print(next(my_sentence))\n", "print(next(my_sentence))\n", "print(next(my_sentence))\n", "# print(next(my_sentence)) # out of elements\n", "\n", "\n", "\n", "class Fibo:\n", "    def __init__(self):\n", "        self.index = -1\n", "\n", "    def __iter__(self):\n", "        return self\n", "\n", "    def __next__(self):\n", "        self.index += 1\n", "        if self.index == 
0:\n", "            return 0\n", "        elif self.index == 1:\n", "            self.prev, self.curr = 0, 1\n", "            return 1\n", "        else:\n", "            nxt = self.prev + self.curr # avoid shadowing the built-in next()\n", "            self.prev, self.curr = self.curr, nxt\n", "            return self.curr\n", "\n", "\n", "for num in Fibo():\n", "    if num > 100:\n", "        break\n", "    print(num)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Context manager and with\n", "\n", "A __context manager__ is an object that defines the runtime context to be established when executing a `with` statement. It provides `__enter__` and `__exit__` methods, which are called when the `with` block is entered and exited." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "class c_manager :\n", "    def __init__(self):\n", "        print(\"Manager constructed\")\n", "    def __enter__(self):\n", "        print(\"Context begins\")\n", "        print(\"------------------------------------------------\")\n", "    def __exit__(self, exc_type, value, traceback):\n", "        print(\"------------------------------------------------\")\n", "        print(\"Context ends\")\n", "\n", "with c_manager():\n", "    print(\"hello\")\n", "    print(\"Let's do some stuff here.\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Example: Using a context manager to measure runtime of a code block" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "from time import time\n", "\n", "class Timer :\n", "    def __init__(self, description):\n", "        self.description = description\n", "    def __enter__(self):\n", "        self.start = time()\n", "    def __exit__(self, exc_type, value, traceback):\n", "        self.end = time()\n", "        print(f\"{self.description}: {self.end - self.start:.2f}s\")\n", "\n", "\n", "with Timer(\"List Comprehension Example\"):\n", "    print(\"We do stuff here\")\n", "    s = [x for x in range(10000000)]\n", "    print(\"We did stuff here\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# NumPy\n", "\n", "__NumPy__ is the 
numerical computation library of Python. \n", "When performing numerical computation, `numpy` arrays are far superior to raw Python `list`s." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## numpy arrays\n", "\n", "`numpy.array(...)` creates a `numpy` array from a Python list." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "\n", "a = np.array([1,2,3], dtype='int32') #dtype specifies data type \n", "\n", "b = np.array([1,2,3], dtype='float64')\n", "\n", "c = np.array([[9.0,8.0,7.0],[6.0,5.0,4.0]])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# dimension of np array\n", "# print(a.ndim)\n", "\n", "# shape of np array\n", "# print(a.shape)\n", "\n", "# number of elements in np array\n", "# print(c.size)\n", "\n", "# type of elements\n", "# print(c.dtype)\n", "\n", "# size of elements in bytes\n", "# print(c.itemsize)\n", "\n", "# total size of np array in bytes\n", "# print(c.nbytes)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this lecture, an \"array\" can have 1, 2, 3, or more dimensions, while a \"matrix\" specifically is 2-dimensional." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Creating basic arrays" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# A = np.zeros((2,3)) # all 0 array\n", "# A = np.ones((4,2,2)) # all 1 array\n", "\n", "# b = np.ones(5) # ndim = 1\n", "# b = np.ones((5,)) # same 1D array\n", "# print(b)\n", "\n", "# # np.random uses different notation for specifying dimensions\n", "# A = np.random.rand(4,2) # random numbers between 0 and 1\n", "# A = np.random.randn(5) # random standard normal\n", "# A = np.random.randint(-4,8, size=(3,3)) # random integers\n", "\n", "# A = np.identity(5) # identity matrix\n", "\n", "# np.arange(...) returns numpy array; range(...) 
returns iterable \n", "# arange is short for array-range; unrelated to verb arrange\n", "x = np.arange(1,8,1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Reorganizing arrays" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "A = np.array([[1,2,3,4],[5,6,7,8]])\n", "# print(A.reshape((4,2)))\n", "# print(A.reshape((4,-1))) # as many columns as needed to fit elements\n", "\n", "v1 = np.array([1,2,3,4])\n", "v2 = np.array([5,6,7,8])\n", "print(np.vstack([v1,v2])) # vertical stack\n", "\n", "h1 = np.ones((2,4))\n", "h2 = np.zeros((2,2))\n", "print(np.hstack((h1,h2))) # horizontal stack" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Vectorizing\n", "The following is a reasonably Pythonic way of plotting $\\sin(x)$ without using `numpy`. (But this is bad.)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import math\n", "import matplotlib.pyplot as plt\n", "\n", "N = 100 # number of sample points\n", "x = [i*(4*math.pi/(N-1)) for i in range(N)]\n", "y = [math.sin(x_i) for x_i in x]\n", "plt.plot(x, y)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is better to use `numpy` and avoid the use of loops or list comprehensions.\n", "With `numpy`, __vectorize__ operations as much as possible." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import matplotlib.pyplot as plt\n", "\n", "x = np.linspace(0, 4*np.pi, 100)\n", "plt.plot(x, np.sin(x)) # math.sin(x) doesn't support vector eval\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you are iterating through a `numpy` array (with a for loop or list comprehension) there is a good chance you are doing something wrong.\n", "Vectorized code is shorter, faster, and usually more readable, so always look for ways to vectorize.\n", "\n", "\n", "(The principle of vectorization applies to `numpy` arrays, but the name __arrayrize__ doesn't roll off one's tongue.)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Broadcasting\n", "\n", "Arithmetic operations on arrays of the same shape are performed elementwise." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x = np.arange(7)\n", "print(x * x) #not the inner product" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When we have arrays of different sizes, the smaller array is __broadcast__ across the larger array and then the arithmetic operations are carried out. (In some sense, broadcast generalizes the outer product of vectors.)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x, y = np.arange(5), np.arange(6)\n", "# print(x + y) # fail! dimension mismatch\n", "# print(x.reshape(-1,1) + y.reshape(1,-1)) # broadcasting\n", "\n", "# print(x.reshape(-1,1) * y.reshape(1,-1)) # outer product with broadcasting\n", "# print(np.outer(x,y)) # outer product with outer" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Scalar-array operations are the most common instance of broadcasting." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# print(5.5 + np.arange(5))\n", "\n", "print(3.5 * np.ones((3,3)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Indexing\n", "\n", "You can access elements of `numpy` arrays with __direct indexing__ and __slicing__, similar to how you access elements of lists. You also have __advanced indexing__ and __Boolean masks__. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "A = np.random.randn(10,10)\n", "print(A[4,5]) # direct indexing\n", "print(A[1:8:2,5:7]) # slicing" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__Be careful when copying numpy arrays!!!__\n", "\n", "For the sake of efficiency, `numpy` operations often avoid copying data and instead provide different __views__ of the underlying data." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x = np.arange(5)\n", "y = x[:] # creates a different view, not a copy of x\n", "for i in range(5):\n", "    y[i] = 0\n", "\n", "# z = x[2:4] # creates a different view, not a copy of x\n", "# z[:] = 7 # write broadcasted\n", "# print(x)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This behavior contrasts with that of lists." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x = [0, 1, 2, 3, 4]\n", "y = x[:] # creates a copy of x\n", "for i in range(5):\n", "    y[i] = 0\n", "print(x)\n", "\n", "# x[:] = 7 # no broadcasting for lists" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you really need to copy the data, be explicit by using `copy()`." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x = np.arange(5)\n", "y = x.copy() # creates a copy of x\n", "y[:] = 0\n", "print(x)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With __advanced indexing__, you pass in a list or `numpy` array of indices to access elements.\n", "(Advanced indexing doesn't work on Python lists.)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# x = np.arange(5)\n", "# print(x)\n", "# print(x[[1,4]])\n", "# print(x[1,4]) # doesn't work. Why?\n", "\n", "A = np.arange(24).reshape((4,-1))\n", "# print(A)\n", "perm = np.random.permutation(np.arange(A.shape[1]))\n", "print(perm)\n", "print(A[:,perm]) #randomly permute columns of A" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With __Boolean masks__, you pass in a list or `numpy` array of booleans of the same shape to access elements. (Boolean masks don't work on Python lists.)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.set_printoptions(formatter={'float': lambda x: \"{0:0.2f}\".format(x)})\n", "np.random.seed(1)\n", "\n", "x = np.random.randn(5)\n", "print(x)\n", "\n", "mask = (x >= 0)\n", "print(mask)\n", "print(x[mask])\n", "\n", "x[mask] = 0\n", "print(x)\n", "\n", "# x[x>=0] = 0\n", "# print(x)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We cannot use Python's `and`, `or`, and `not` on Boolean masks. You must explicitly use NumPy's versions of the Boolean operators (or the elementwise operators `&`, `|`, `^`, and `~`)." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.random.seed(1)\n", "x = 5*np.random.randn(5)\n", "\n", "mask1 = x <= 6\n", "mask2 = x >= 3\n", "print(mask1)\n", "print(mask2)\n", "print(np.logical_and(mask1, mask2))\n", "print(np.logical_xor(mask1, mask2))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Linear Algebra\n", "\n", "Perform matrix multiplication with `@` rather than `*`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "\n", "n = 7\n", "A = np.ones((n,n))\n", "b = np.arange(n)\n", "# print(A*b) # broadcasted product. *Not* matrix-vector product.\n", "# print(A@b) # matrix-vector product\n", "\n", "# print(b*b) # element-wise product\n", "print(b@b) # dot product" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Transpose a matrix with `.transpose()` or `.T`." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "A = np.ones((4,7))\n", "b = np.random.randn(7)\n", "\n", "print(A.T@A@b)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `np.linalg` module provides linear algebraic functions." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "A = np.identity(3)\n", "print(np.linalg.det(A)) # determinant\n", "print(np.linalg.eigvals(A)) # eigenvalues" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Matplotlib and pyplot" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`matplotlib` and `pyplot` plot data contained in raw Python lists and `numpy` arrays.\n", "In its most basic form, a plot is a line sequentially connecting points in the 2D plane. 
\n", "\n", "To display plots on Jupyter notebooks, use the \"magic\" `%matplotlib inline`" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "%matplotlib inline\n", "\n", "plt.plot([0,1,2,3],[5,9,3,2])\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can have multiple curves on the same plot" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "%matplotlib inline\n", "\n", "plt.plot([0,1,2,3],[5,9,3,2])\n", "plt.plot([0,0.4,2.2,3],[-1,2,2,5])\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Manually choose the axis limits with `axis([xmin,xmax,ymin,ymax])`" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import numpy as np\n", "%matplotlib inline\n", "\n", "xx = np.linspace(-2,2,1024)\n", "plt.plot(xx,np.cos(xx))\n", "plt.plot(xx,np.exp(xx))\n", "\n", "plt.axis([-1.5,1.5,-1,3])\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Label your plots as follows." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import numpy as np\n", "%matplotlib inline\n", "\n", "xx = np.linspace(-2,2,1024)\n", "plt.plot(xx,np.cos(xx))\n", "plt.plot(xx,np.exp(xx))\n", "\n", "plt.axis([-1.5,1.5,-1,3])\n", "\n", "plt.xlabel(\"Input values\")\n", "plt.ylabel(\"Function values\")\n", "plt.title(\"Plot title\")\n", "\n", "plt.legend([\"cos(x) function\", \"exp(x) function\"])\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is better to specify the legends via keyword arguments." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import numpy as np\n", "%matplotlib inline\n", "\n", "xx = np.linspace(-2,2,1024)\n", "plt.plot(xx,np.cos(xx), label=\"cos(x) function\")\n", "plt.plot(xx,np.exp(xx), label=\"exp(x) function\")\n", "\n", "plt.axis([-1.5,1.5,-1,3])\n", "\n", "plt.xlabel(\"Input values\")\n", "plt.ylabel(\"Function values\")\n", "plt.title(\"Plot title\")\n", "\n", "plt.legend()\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can specify line styles with [\"format strings\"](https://matplotlib.org/3.2.1/api/_as_gen/matplotlib.pyplot.plot.html)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "%matplotlib inline\n", "\n", "# plt.plot([0,1,2,3],[5,9,3,2], 'r+') #red, no line, cross marker\n", "# plt.plot([0,0.4,2.2,3],[-1,2,2,5], 'b--o') #blue, -- line, circle marker\n", "\n", "\n", "plt.plot([0,1,2,3],[5,9,3,2],'r:')\n", "plt.plot([0,0.4,2.2,3],[-1,2,2,5],'k-.')\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "While format strings are concise and \"standard\", I don't think they are very readable. I prefer using keyword arguments. You can specify colors with their names or their RGB hex code." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "%matplotlib inline\n", "\n", "plt.plot([0,1,2,3],[5,9,3,2], color='#0F0F70', linestyle='--')\n", "plt.plot([0,0.4,2.2,3],[2,-1,-2,3], color=\"green\", linestyle=':', marker='p')\n", "plt.plot([0,0.4,2.2,3],[-1,2,2,5], color=\"#dcdab2\", linestyle='-', marker='o')\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can specify other plot properties with keyword arguments." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "%matplotlib inline\n", "\n", "plt.plot([0,1,2,3],[5,9,3,2], color='#0F0F70', linestyle='--',\n", " linewidth=4)\n", "plt.plot([0,0.4,2.2,3],[2,-1,-2,3], color=\"green\", linestyle=':', marker='p',\n", " linewidth=4, markersize=15)\n", "plt.plot([0,0.4,2.2,3],[-1,2,2,5], color=\"#dcdab2\", linestyle='-', marker='o',\n", " linewidth=4, markersize=15)\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Lines are layered in the order they are added." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import numpy as np\n", "%matplotlib inline\n", "\n", "xx = np.linspace(-2,2,1024)\n", "plt.plot(xx,np.exp(xx), color='red', linewidth=7) #order matters\n", "plt.plot(xx,np.cos(xx), color='blue', linewidth=7) #order matters\n", "\n", "plt.axis([-1.5,1.5,-1,3])\n", "\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can change font settings with `plt.rc`.\n", "(\"rc\" is a standard abbreviation in programming for \"runtime configuration\".)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import numpy as np\n", "%matplotlib inline\n", "\n", "plt.rc('text', usetex=True)\n", "plt.rc('font', family='serif')\n", "plt.rc('font', size = 16)\n", "\n", "xx = np.linspace(-2,2,1024)\n", "plt.plot(xx,np.sin(xx), label=r\"$\\sin(x)$ function\") #raw string keeps the LaTeX backslash\n", "plt.plot(xx,np.sin(xx)**2, label=r\"$\\sin^2(x)$ function\")\n", "\n", "plt.xlabel(\"$x$-coordinate\")\n", "plt.ylabel(\"$y$-coordinate\")\n", "plt.title(r\"Title includes LaTeX $\\|A^Tx-b\\|^2$\")\n", "\n", "plt.legend()\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To return all `rc` settings to default, use:" ] }, { "cell_type": 
"code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.rcdefaults()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`plt.grid()` creates a grid in the background." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import numpy as np\n", "%matplotlib inline\n", "\n", "xx = np.linspace(-2,2,1024)\n", "plt.plot(xx,np.sin(xx))\n", "plt.plot(xx,np.sin(xx)**2)\n", "\n", "\n", "plt.grid()\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you are unhappy with the default style but do not want to spend time customizing your plots, use one of the available styles." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import numpy as np\n", "%matplotlib inline\n", "\n", "print(plt.style.available) #list of available styles\n", "plt.rcdefaults()\n", "plt.style.use('fivethirtyeight') #use style ggplot (\"gg\" stands for Leland Wilkinson's \"Grammar of Graphics\")\n", "\n", "xx = np.linspace(-2,2,1024)\n", "plt.plot(xx,np.sin(xx), label=\"some function\")\n", "plt.plot(xx,np.sin(xx)**2, label=\"another function\")\n", "\n", "plt.xlabel(\"Input\")\n", "plt.ylabel(\"Output\")\n", "plt.title(\"Title stuff\")\n", "\n", "plt.legend()\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Save your figure as an image file using `plt.savefig(...)`." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import numpy as np\n", "%matplotlib inline\n", "\n", "plt.rcdefaults()\n", "\n", "xx = np.linspace(-2,2,1024)\n", "plt.plot(xx,np.sin(xx), label=\"some function\")\n", "plt.plot(xx,np.sin(xx)**2, label=\"another function\")\n", "\n", "plt.xlabel(\"Input\")\n", "plt.ylabel(\"Output\")\n", "plt.title(\"Title stuff\")\n", "\n", "plt.legend()\n", "\n", "plt.savefig('plot.png')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# PyTorch\n", "\n", "__PyTorch__ is a machine learning library for Python. \n", "\n", "PyTorch is fundamentally a numerical computation library, and it shares many similarities with NumPy.\n", "\n", "Key differences from NumPy that make PyTorch suitable for neural networks and machine learning:\n", "1. It supports easy GPU computation.\n", "2. It provides automatic differentiation.\n", "3. It comes with numerous ML libraries and sample code.\n", "\n", "Think of it as a replacement for NumPy." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In PyTorch, people say _tensor_ rather than _array_." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import torch\n", "print(\"torch version:\", torch.__version__)\n", "\n", "print('\\nCreate a zero ndarray in NumPy:')\n", "zero_np = np.zeros([2, 3])\n", "print(zero_np)\n", "print('\\nCreate a zero tensor in PyTorch:')\n", "zero_pt = torch.zeros([2,3])\n", "print(zero_pt)\n", "\n", "print(zero_np.shape)\n", "print(zero_pt.shape)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can index elements of PyTorch tensors as you index elements of NumPy arrays."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(zero_np[0,1])\n", "print(zero_pt[0,1])\n", "print(zero_pt[0,1].item()) #convert a scalar tensor into a regular Python number" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "An ndarray can be converted into a tensor, and vice versa." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "zero_pt_from_np = torch.tensor(zero_np) #copies the data\n", "print(zero_pt_from_np)\n", "\n", "zero_np_from_pt = zero_pt.numpy() #shares memory with zero_pt\n", "print(zero_np_from_pt)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The _rank_ of a tensor is the number of dimensions." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "print(len(zero_pt.shape))\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Use `numel()` to obtain the total number of elements. (Unlike in NumPy, where `size` is the number of elements, `size()` in PyTorch is the same as `shape`.)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(zero_pt.numel())\n", "\n", "print(zero_pt.shape)\n", "\n", "print(zero_np.size)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Reshaping PyTorch Tensors" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "t = torch.tensor([\n", " [1,1,1,1],\n", " [2,2,2,2],\n", " [3,3,3,3]\n", "], dtype=torch.float32)\n", "\n", "print(t)\n", "\n", "t = t.reshape(-1,1)\n", "print(t)\n", "print(t.shape) #rank preserved\n", "\n", "t = t.reshape(2,-1,3)\n", "print(t)\n", "print(t.shape) #rank changed" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "_Squeezing_ removes dimensions with length 1 and _unsqueezing_ adds a dimension with length 1."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# s = t.reshape(12,1).squeeze()\n", "# print(s)\n", "# print(s.shape) #rank reduced\n", "\n", "s = t.reshape(12,1).unsqueeze(dim=0)\n", "print(s)\n", "print(s.shape) #rank increased" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Tensors often have the structure (batch)x(data).\n", "\n", "If the data is a 2D color image, you have a 4D tensor of shape (batch)x(RGB channel)x(height)x(width)." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.6" } }, "nbformat": 4, "nbformat_minor": 4 }